gh-144766: Fix flaky test_trampoline_works_with_forks#148056
yonatan-genai wants to merge 4 commits into python:main
The fork child ran full Python finalization instead of calling os._exit(0), which is fragile when perf trampoline support is active (unmapping executable memory and unregistering code watchers during finalization can crash intermittently). The newer test added in the same file (test_trampoline_works_after_fork_with_many_code_objects) already uses os._exit(0) for this reason. Also fix the parent's wait status handling: os.waitpid returns a raw wait status, not an exit code. Use os.WEXITSTATUS to extract the actual exit code, and check os.WIFSIGNALED for signal deaths. <claude>
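The pattern described in this commit message can be sketched roughly as follows. This is a minimal illustration, not the code from the test file: `run_child_and_wait` and `child_body` are hypothetical names, and the child here reports failure with exit code 1, which is an assumption made for the example.

```python
import os


def run_child_and_wait(child_body):
    """Fork, run child_body() in the child, and return the child's exit code."""
    pid = os.fork()
    if pid == 0:
        # Child: always leave through os._exit() so that interpreter
        # finalization (and any trampoline teardown) is skipped.
        exit_code = 1  # assumed failure code for this sketch
        try:
            child_body()
            exit_code = 0
        finally:
            os._exit(exit_code)

    # Parent: os.waitpid() returns (pid, raw_status); the raw status has to
    # be decoded before it can be treated as an exit code.
    _, status = os.waitpid(pid, 0)
    if os.WIFSIGNALED(status):
        raise RuntimeError(f"child killed by signal {os.WTERMSIG(status)}")
    return os.WEXITSTATUS(status)
```

Calling `run_child_and_wait(lambda: None)` should give 0, while a `child_body` that raises should give 1, since the `finally` clause exits the child before the exception can propagate.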
Most changes to Python require a NEWS entry. Add one using the blurb_it web app or the blurb command-line tool. If this change has little impact on Python users, wait for a maintainer to apply the |
os._exit() does not flush Python IO buffers, so the print(os.getpid()) output was lost when stdout is piped. Add flush=True to ensure the child PID reaches the parent. Also add the required NEWS entry. <claude>
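The buffering effect can be reproduced outside the test suite with a small sketch like the one below. It assumes a child interpreter whose stdout is a pipe and clears `PYTHONUNBUFFERED` so the default buffering applies; `captured_output` and `CHILD_TEMPLATE` are hypothetical names used only for this illustration.

```python
import os
import subprocess
import sys

# Child program: print its PID, then leave via os._exit(), which skips
# the normal flushing of Python-level IO buffers.
CHILD_TEMPLATE = "import os; print(os.getpid(){flush}); os._exit(0)"


def captured_output(flush):
    """Run the child with stdout piped and return whatever bytes arrived."""
    code = CHILD_TEMPLATE.format(flush=", flush=True" if flush else "")
    # Drop PYTHONUNBUFFERED so the child uses its default buffering.
    env = {k: v for k, v in os.environ.items() if k != "PYTHONUNBUFFERED"}
    result = subprocess.run(
        [sys.executable, "-c", code],
        stdout=subprocess.PIPE,
        env=env,
        check=True,
    )
    return result.stdout
```

With `flush=False` the PID stays in the child's buffer and is discarded at `os._exit(0)`; with `flush=True` it reaches the pipe before the process exits.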
@@ -0,0 +1,4 @@
Fix flaky :func:`test_trampoline_works_with_forks` by using ``os._exit(0)``
Roles like func can only link against objects Sphinx knows of. Test functions aren't usually documented.
I don't think this needs a news entry anyway.
Fix flaky :func:`test_trampoline_works_with_forks` by using ``os._exit(0)``
in the fork child to avoid fragile Python finalization, and by flushing
stdout before exit. Also fix the parent's wait status handling to use
``os.WEXITSTATUS()``.
- ``os.WEXITSTATUS()``.
+ :func:`os.WEXITSTATUS`.
Might work, though.
Removed the NEWS entry as this is a test-only change. Thanks @StanFromIreland for pointing that out.
Summary
test_trampoline_works_with_forks intermittently fails on macOS CI because the fork child runs full Python finalization instead of calling os._exit(0). Perf trampoline finalization in a forked child is fragile: it unmaps executable memory and unregisters code watchers while code objects are being destroyed, which can crash intermittently.

The newer test in the same file (test_trampoline_works_after_fork_with_many_code_objects, added in gh-144766) already uses os._exit(0) in the child for exactly this reason. This applies the same pattern to the older test.

Also fixes the parent's wait status handling: os.waitpid returns a raw 16-bit wait status, not an exit code. The old code passed this raw status to sys.exit(), which could produce incorrect exit codes. Now uses os.WEXITSTATUS() to extract the actual exit code and os.WIFSIGNALED() to detect signal deaths.

Changes
Lib/test/test_perf_profiler.py: Add os._exit(0) in fork child, fix wait status handling in parent

Discovered while testing #148050 (unrelated multiprocessing change that was blocked by this flaky test on macOS CI).
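To illustrate the difference between a raw wait status and an exit code, here is a small sketch that decodes both a normal exit and a signal death. The helper names `decode_wait_status` and `fork_and_wait` are hypothetical and do not come from the test file; the children exit with code 3 and die from SIGKILL purely as example values.

```python
import os
import signal


def decode_wait_status(status):
    """Turn a raw wait status from os.waitpid() into a (kind, value) pair."""
    if os.WIFSIGNALED(status):
        return ("signal", os.WTERMSIG(status))
    if os.WIFEXITED(status):
        return ("exit", os.WEXITSTATUS(status))
    return ("other", status)


def fork_and_wait(child_action):
    """Fork, run child_action() in the child, and return the raw wait status."""
    pid = os.fork()
    if pid == 0:
        child_action()
        os._exit(0)  # reached only if child_action() returns
    _, status = os.waitpid(pid, 0)
    return status


# One child exits with code 3, the other is killed by SIGKILL.
exited = fork_and_wait(lambda: os._exit(3))
killed = fork_and_wait(lambda: os.kill(os.getpid(), signal.SIGKILL))
```

On Linux, for example, the raw status for an exit code of 3 typically arrives as 768, so handing it straight to sys.exit() would not report 3 to the caller. Decoding first keeps the two cases distinct.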